Placing structuring elements in a word sequence for generating new statistical language models
Authors
Abstract
Class-based n-gram language models have been applied successfully in speech technology. We present an automatic method that improves n-gram language models by redistributing structural elements within word sequences. The algorithm operates on textual data made up of two kinds of text elements: words and structural elements. The order of the words is never changed during the iterations; the algorithm may only insert or delete structural elements between any two items in the data. As a result, unseen n-grams are interpolated by n-grams that contain structural elements. We give a detailed description of the algorithm and present first results from a system trained on a small corpus.
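The two permitted operations can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the objective function or the search strategy, so the sketch only shows insertion and deletion of a structural element without reordering words, and how an unseen bigram can be covered by bigrams that contain a structural element. The marker symbol `<s>`, the toy corpus, and all function names are assumptions made for illustration.

```python
from collections import Counter

MARK = "<s>"  # hypothetical symbol standing in for a structural element


def words_only(seq):
    """Drop structural elements; the remaining word order must never change."""
    return [tok for tok in seq if tok != MARK]


def insert_marker(seq, i):
    """Return a copy of seq with a structural element inserted before index i."""
    return seq[:i] + [MARK] + seq[i:]


def delete_marker(seq, i):
    """Return a copy of seq without the structural element at index i."""
    if seq[i] != MARK:
        raise ValueError("only structural elements may be deleted")
    return seq[:i] + seq[i + 1:]


def bigram_counts(seq):
    """Count adjacent token pairs, treating markers like ordinary tokens."""
    return Counter(zip(seq, seq[1:]))


def unseen(seq, counts):
    """Number of bigrams in seq that never occur in the training counts."""
    return sum(1 for bg in zip(seq, seq[1:]) if counts[bg] == 0)


# Toy training data (assumed): words interleaved with structural elements.
train = "the cat sleeps <s> the dog barks <s> the cat barks".split()
counts = bigram_counts(train)

# The bigram ("barks", "the") never occurs in the training data.
plain = "the dog barks the cat sleeps".split()
# Inserting a marker replaces it with ("barks", "<s>") and ("<s>", "the").
candidate = insert_marker(plain, 3)

print(unseen(plain, counts))      # 1
print(unseen(candidate, counts))  # 0
```

In this toy case the word sequence is unchanged (`words_only(candidate) == plain`), and deleting the marker again with `delete_marker(candidate, 3)` restores the original sequence. An iterative procedure of the kind the abstract describes would presumably repeat such moves under some scoring criterion, which is left open here.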
Similar resources
Iranian Advanced EFL Learners’ Awareness and the Use of Marked Word Order: Discourse-pragmatically Motivated Variations
The present investigation was designed to study the production and comprehension of specific means for information highlighted by advanced Iranian learners of English as a Foreign Language. The study focused on the discourse-pragmatically motivated variations of the basic word order such as inversion, pre-posing, it- and Wh-clefts. After taking the Nelson test, a homogeneous group was settled. ...
Conceptual Metaphoric Language Use in Structuring Political Discourse in Iran-West Relations: A CDA Perspective
The present study was carried out with the purpose of examining the role of metaphorical language in the critical discourse analysis (CDA) of political texts based on a modern framework postulated by Kövecses (2015). The corpus of the study consisted of thirty-thousand words chosen as a textual sample to see which source conceptual domains are used and what generic/discursive attributes emerge ...
Schemata-Building Role of Teaching Word History in Developing Reading Comprehension Ability
Methodologically, vocabulary instruction has faced significant ups and downs during the history of language education; sometimes integrated with the other elements of language network, other times tackled as a separate component. Among many variables supposedly affecting vocabulary achievement, the role of teaching word history, as a schemata-building strategy, in developing reading comprehensi...
The Impact of Teachers' Training on the Reliability of Tests and Assessments in Governmental and Non-governmental Sections
Assessment is considered one of the fundamental elements in the field of foreign language acquisition. For communication to take place, learners need to know an adequate number of words. The salient role of vocabulary in the field of foreign language acquisition has resulted in the publication of several hundred papers and dozens of books. Due to the dominant role of ...
A new model for Persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated; the hyphen separates the different parts. Persian has multi-part words as well. According to Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space character leads to some s...